Multi-Cloud Workflow with Pangeo

This example demonstrates a workflow using analysis-ready data provided in two public clouds.

  • LENS (Hosted on AWS in the us-west-2 region)

  • ERA5 (Hosted on Google Cloud Platform in multiple regions)

We’ll perform a similar analysis on each dataset (a histogram of total precipitation) and compare the results. Notably, this computation reduces a large dataset to a small summary, and the reduction can happen on a cluster in the cloud.

By placing a compute cluster in the cloud next to the data, we avoid moving large amounts of data over the public internet. The large analysis-ready data only needs to move within a cloud region: from the machines storing the data in an object store like S3 to the machines performing the analysis. The compute cluster reduces the large amount of data to a small histogram summary. At a few hundred kilobytes, the summary statistics can easily be moved back to the local client, which might be running on a laptop. This also avoids costly egress charges from moving large amounts of data out of cloud regions.

import getpass

import dask
from distributed import Client
from dask_gateway import Gateway, BasicAuth
import intake
import numpy as np
import s3fs
import xarray as xr
from xhistogram.xarray import histogram

Create Dask Clusters

We’ve deployed Dask Gateway on two Kubernetes clusters, one in AWS and one in GCP. We’ll use these to create Dask clusters in the same cloud region as the data. We’ll connect to both of them from the same interactive notebook session.

password = getpass.getpass()
auth = BasicAuth("pangeo", password)
# Create a Dask Cluster on AWS
aws_gateway = Gateway(
    "http://a00670d37945911eab47102a1da71b1b-524946043.us-west-2.elb.amazonaws.com",
    auth=auth,
)
aws = aws_gateway.new_cluster()
aws_client = Client(aws, set_as_default=False)
aws_client

Client

Cluster

  • Workers: 0
  • Cores: 0
  • Memory: 0 B
# Create a Dask Cluster on GCP
gcp_gateway = Gateway(
    "http://34.72.56.89",
    auth=auth,
)
gcp = gcp_gateway.new_cluster()
gcp_client = Client(gcp, set_as_default=False)
gcp_client

Client

Cluster

  • Workers: 0
  • Cores: 0
  • Memory: 0 B

We’ll enable adaptive mode on each of the Dask clusters. Workers will be added and removed as needed by the current level of computation.

aws.adapt(minimum=1, maximum=200)
gcp.adapt(minimum=1, maximum=200)

ERA5 on Google Cloud Storage

We’ll use intake and Pangeo’s data catalog to discover the dataset.

cat = intake.open_catalog(
    "https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/master.yaml"
)
cat
<Intake catalog: master>

The next cell loads the metadata as an xarray dataset. No large amount of data is read or transferred here. The data will be loaded on demand when we ask for a concrete result later.

era5 = cat.atmosphere.era5_hourly_reanalysis_single_levels_sa(
    storage_options={"requester_pays": False, "token": "anon"}
).to_dask()
era5
xarray.Dataset
    • latitude: 721
    • longitude: 1440
    • time: 350640
    • latitude
      (latitude)
      float32
      90.0 89.75 89.5 ... -89.75 -90.0
      long_name :
      latitude
      units :
      degrees_north
      array([ 90.  ,  89.75,  89.5 , ..., -89.5 , -89.75, -90.  ], dtype=float32)
    • longitude
      (longitude)
      float32
      0.0 0.25 0.5 ... 359.5 359.75
      long_name :
      longitude
      units :
      degrees_east
      array([0.0000e+00, 2.5000e-01, 5.0000e-01, ..., 3.5925e+02, 3.5950e+02,
             3.5975e+02], dtype=float32)
    • time
      (time)
      datetime64[ns]
      1979-01-01 ... 2018-12-31T23:00:00
      long_name :
      time
      array(['1979-01-01T00:00:00.000000000', '1979-01-01T01:00:00.000000000',
             '1979-01-01T02:00:00.000000000', ..., '2018-12-31T21:00:00.000000000',
             '2018-12-31T22:00:00.000000000', '2018-12-31T23:00:00.000000000'],
            dtype='datetime64[ns]')
    • asn
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      Snow albedo
      units :
      (0 - 1)
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • d2m
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      2 metre dewpoint temperature
      units :
      K
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • e
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      Evaporation
      standard_name :
      lwe_thickness_of_water_evaporation_amount
      units :
      m of water equivalent
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • mn2t
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      Minimum temperature at 2 metres since previous post-processing
      units :
      K
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • mx2t
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      Maximum temperature at 2 metres since previous post-processing
      units :
      K
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • ptype
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      Precipitation type
      units :
      code table (4.201)
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • ro
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      Runoff
      units :
      m
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • sd
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      Snow depth
      standard_name :
      lwe_thickness_of_surface_snow_amount
      units :
      m of water equivalent
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • sro
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      Surface runoff
      units :
      m
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • ssr
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      Surface net solar radiation
      standard_name :
      surface_net_downward_shortwave_flux
      units :
      J m**-2
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • t2m
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      2 metre temperature
      units :
      K
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • tcc
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      Total cloud cover
      standard_name :
      cloud_area_fraction
      units :
      (0 - 1)
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • tcrw
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      Total column rain water
      units :
      kg m**-2
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • tp
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      Total precipitation
      units :
      m
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • tsn
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      Temperature of snow layer
      standard_name :
      temperature_in_surface_snow
      units :
      K
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • u10
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      10 metre U wind component
      units :
      m s**-1
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
    • v10
      (time, latitude, longitude)
      float32
      dask.array<chunksize=(31, 721, 1440), meta=np.ndarray>
      long_name :
      10 metre V wind component
      units :
      m s**-1
      Array Chunk
      Bytes 1.46 TB 128.74 MB
      Shape (350640, 721, 1440) (31, 721, 1440)
      Count 11312 Tasks 11311 Chunks
      Type float32 numpy.ndarray
  • Conventions :
    CF-1.6
    history :
    2019-09-20 05:15:01 GMT by grib_to_netcdf-2.10.0: /opt/ecmwf/eccodes/bin/grib_to_netcdf -o /cache/data7/adaptor.mars.internal-1568954670.105603-18230-3-5ca6e0df-a562-42ba-8b0b-948b2e1815bd.nc /cache/tmp/5ca6e0df-a562-42ba-8b0b-948b2e1815bd-adaptor.mars.internal-1568954670.1062171-18230-2-tmp.grib

We’re computing the histogram on the total precipitation for a specific time period. xarray makes selecting this subset of data quite natural. Again, we still haven’t loaded the data.

tp = era5.tp.sel(time=slice('1990-01-01', '2005-12-31'))
tp
xarray.DataArray
'tp'
  • time: 140256
  • latitude: 721
  • longitude: 1440
  • dask.array<chunksize=(9, 721, 1440), meta=np.ndarray>
    Array Chunk
    Bytes 582.48 GB 128.74 MB
    Shape (140256, 721, 1440) (31, 721, 1440)
    Count 15838 Tasks 4526 Chunks
    Type float32 numpy.ndarray
    • latitude
      (latitude)
      float32
      90.0 89.75 89.5 ... -89.75 -90.0
      long_name :
      latitude
      units :
      degrees_north
      array([ 90.  ,  89.75,  89.5 , ..., -89.5 , -89.75, -90.  ], dtype=float32)
    • longitude
      (longitude)
      float32
      0.0 0.25 0.5 ... 359.5 359.75
      long_name :
      longitude
      units :
      degrees_east
      array([0.0000e+00, 2.5000e-01, 5.0000e-01, ..., 3.5925e+02, 3.5950e+02,
             3.5975e+02], dtype=float32)
    • time
      (time)
      datetime64[ns]
      1990-01-01 ... 2005-12-31T23:00:00
      long_name :
      time
      array(['1990-01-01T00:00:00.000000000', '1990-01-01T01:00:00.000000000',
             '1990-01-01T02:00:00.000000000', ..., '2005-12-31T21:00:00.000000000',
             '2005-12-31T22:00:00.000000000', '2005-12-31T23:00:00.000000000'],
            dtype='datetime64[ns]')
  • long_name :
    Total precipitation
    units :
    m
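
As a quick sanity check on the selection above, the number of hourly steps in the slice can be reproduced with the standard library alone. This is only a sketch of the arithmetic; the variable names are illustrative and not part of the original notebook.

```python
from datetime import datetime, timedelta

# slice('1990-01-01', '2005-12-31') is inclusive on both ends for hourly data:
# the first step is 1990-01-01T00:00 and the last is 2005-12-31T23:00.
start = datetime(1990, 1, 1, 0)
end = datetime(2005, 12, 31, 23)

# Number of hourly timestamps, counting both endpoints.
n_hours = int((end - start) / timedelta(hours=1)) + 1

# Number of complete 6-hour windows that fit in the selection.
n_6hr_windows = n_hours // 6

print(n_hours)        # 140256, matching the `time` dimension of tp
print(n_6hr_windows)  # 23376
```

Sixteen years with four leap days give 5844 days, which is where the 140256 hourly steps in the repr come from.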

To compare to the 6-hourly LENS dataset, we’ll aggregate to 6-hourly totals.

# convert to 6-hourly precip totals
tp_6hr = tp.coarsen(time=6).sum()
tp_6hr
xarray.DataArray
  • time: 23376
  • latitude: 721
  • longitude: 1440
  • dask.array<chunksize=(1, 721, 1440), meta=np.ndarray>
    Array Chunk
    Bytes 97.08 GB 24.92 MB
    Shape (23376, 721, 1440) (6, 721, 1440)
    Count 50536 Tasks 4526 Chunks
    Type float32 numpy.ndarray
    • latitude
      (latitude)
      float32
      90.0 89.75 89.5 ... -89.75 -90.0
      long_name :
      latitude
      units :
      degrees_north
      array([ 90.  ,  89.75,  89.5 , ..., -89.5 , -89.75, -90.  ], dtype=float32)
    • longitude
      (longitude)
      float32
      0.0 0.25 0.5 ... 359.5 359.75
      long_name :
      longitude
      units :
      degrees_east
      array([0.0000e+00, 2.5000e-01, 5.0000e-01, ..., 3.5925e+02, 3.5950e+02,
             3.5975e+02], dtype=float32)
    • time
      (time)
      datetime64[ns]
      1990-01-01T02:30:00 ... 2005-12-31T20:30:00
      array(['1990-01-01T02:30:00.000000000', '1990-01-01T08:30:00.000000000',
             '1990-01-01T14:30:00.000000000', ..., '2005-12-31T08:30:00.000000000',
             '2005-12-31T14:30:00.000000000', '2005-12-31T20:30:00.000000000'],
            dtype='datetime64[ns]')
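
To make the coarsening step concrete, here is a small NumPy analogue of what `coarsen(time=6).sum()` does along the time axis. It is a sketch on made-up values, not the xarray implementation: consecutive hours are grouped into blocks of six and summed, and each block is labelled with the mean of its members, which is why the timestamps in the output above fall on the half hour (02:30, 08:30, and so on).

```python
import numpy as np

# Two days of made-up hourly precipitation in metres (illustrative values only).
hourly = np.arange(48, dtype="float64") * 1e-4

# Group consecutive hours into windows of 6 and sum within each window.
six_hourly = hourly.reshape(-1, 6).sum(axis=1)

# Hours since the start of the series; label each window by the mean of its members.
hours = np.arange(48)
window_labels = hours.reshape(-1, 6).mean(axis=1)

print(six_hourly.shape)   # 48 hourly values become 8 six-hourly totals
print(window_labels[:3])  # first windows are centred at 2.5, 8.5 and 14.5 hours
```

The reshape trick only works when the series length is a multiple of the window size, which holds for the 140256 hourly steps selected here.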

We’ll bin this data into the following bins.

tp_6hr_bins = np.concatenate([[0], np.logspace(-5,  0, 50)])
tp_6hr_bins
array([0.00000000e+00, 1.00000000e-05, 1.26485522e-05, 1.59985872e-05,
       2.02358965e-05, 2.55954792e-05, 3.23745754e-05, 4.09491506e-05,
       5.17947468e-05, 6.55128557e-05, 8.28642773e-05, 1.04811313e-04,
       1.32571137e-04, 1.67683294e-04, 2.12095089e-04, 2.68269580e-04,
       3.39322177e-04, 4.29193426e-04, 5.42867544e-04, 6.86648845e-04,
       8.68511374e-04, 1.09854114e-03, 1.38949549e-03, 1.75751062e-03,
       2.22299648e-03, 2.81176870e-03, 3.55648031e-03, 4.49843267e-03,
       5.68986603e-03, 7.19685673e-03, 9.10298178e-03, 1.15139540e-02,
       1.45634848e-02, 1.84206997e-02, 2.32995181e-02, 2.94705170e-02,
       3.72759372e-02, 4.71486636e-02, 5.96362332e-02, 7.54312006e-02,
       9.54095476e-02, 1.20679264e-01, 1.52641797e-01, 1.93069773e-01,
       2.44205309e-01, 3.08884360e-01, 3.90693994e-01, 4.94171336e-01,
       6.25055193e-01, 7.90604321e-01, 1.00000000e+00])
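
The structure of these bin edges can be checked directly. The sketch below rebuilds the same array and confirms that it holds 51 edges (so 50 bins): one linear bin from 0 to 1e-5, followed by log-spaced bins whose edges grow by a constant factor. The helper names are illustrative.

```python
import numpy as np

# Same construction as in the notebook: a leading 0 plus 50 log-spaced edges.
bins = np.concatenate([[0], np.logspace(-5, 0, 50)])

n_edges = len(bins)   # 51 edges
n_bins = n_edges - 1  # 50 bins, matching the histogram's second dimension

# Consecutive log-spaced edges differ by a constant factor of 10**(5/49).
ratios = bins[2:] / bins[1:-1]

print(n_edges, n_bins)
print(ratios.min(), ratios.max())  # both close to 1.2649
```

The leading zero matters because a log-spaced grid can never reach 0, so dry grid cells would otherwise fall outside every bin.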

The next cell applies the histogram along the longitude dimension and takes the mean over time. We’re still just building up the computation here; we haven’t loaded the data or executed anything yet.

tp_hist = histogram(
    tp_6hr.rename('tp_6hr'), bins=[tp_6hr_bins], dim=['longitude']
).mean(dim='time')
tp_hist.data
        Array        Chunk
Bytes   288.40 kB    288.40 kB
Shape   (721, 50)    (721, 50)
Count   110889 Tasks 1 Chunks
Type    float64      numpy.ndarray
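
The shape of this reduction can be illustrated with plain NumPy on a tiny synthetic array. This is a sketch of the idea only: it uses `np.histogram`, whose treatment of bin edges may differ in detail from xhistogram, and the array sizes and random values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for tp_6hr with dims (time, latitude, longitude); values are synthetic.
n_time, n_lat, n_lon = 4, 3, 10
data = rng.uniform(0.0, 0.5, size=(n_time, n_lat, n_lon))

bins = np.concatenate([[0], np.logspace(-5, 0, 50)])

# Histogram along longitude for every (time, latitude) pair.
counts = np.apply_along_axis(
    lambda row: np.histogram(row, bins=bins)[0], axis=-1, arr=data
)  # shape: (time, latitude, n_bins)

# Average the per-time-step counts, leaving (latitude, n_bins).
mean_counts = counts.mean(axis=0)

print(counts.shape)       # (4, 3, 50)
print(mean_counts.shape)  # (3, 50)
```

In the real computation the same pattern collapses (23376, 721, 1440) to (721, 50): longitude is replaced by the 50 bins and time is averaged away.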

In total, we’re going from the ~1.5TB raw dataset down to a small 288 kB result that is the histogram summarizing the total precipitation. We’ve built up a large sequence of operations to do that reduction (over 110,000 individual tasks), and now it’s time to actually execute it. There will be some delay between running the next cell, the scheduler receiving the task graph, and the cluster starting to process it, but work is happening in the background. After a minute or so, tasks will start appearing on the Dask dashboard.
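
The sizes quoted here follow from the array shapes and dtypes shown in the reprs. The short calculation below reproduces them; the variable names are illustrative, and the subset figure refers to the 1990 to 2005 selection that the histogram actually reads.

```python
# Shapes and dtypes are read off the reprs shown earlier in this notebook.
n_lat, n_lon = 721, 1440
float32_bytes, float64_bytes = 4, 8

raw_bytes = 350640 * n_lat * n_lon * float32_bytes     # full hourly tp variable
subset_bytes = 140256 * n_lat * n_lon * float32_bytes  # 1990-2005 selection
summary_bytes = 721 * 50 * float64_bytes               # histogram: latitude x bins

reduction = raw_bytes / summary_bytes

print(f"full variable: {raw_bytes / 1e12:.2f} TB")
print(f"selected subset: {subset_bytes / 1e9:.2f} GB")
print(f"summary: {summary_bytes / 1e3:.1f} kB")
print(f"reduction factor: about {reduction:,.0f}x")
```

A reduction of several million to one is what makes it practical to send only the summary back to the local client.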

One thing to note: we request this result with the gcp_client, the client for the cluster in the same cloud region as the data.

era5_tp_hist_ = gcp_client.compute(tp_hist, retries=5)

era5_tp_hist_ is a Future pointing to the result on the cluster. The actual computation is happening in the background, and we’ll call .result() to get the concrete result later on.

era5_tp_hist_
Future: finalize status: pending, key: finalize-827454f3f45ccd1c7c22f0b3907c098c
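
The submit-now, collect-later pattern is not specific to Dask. As a rough analogy only, the sketch below uses the standard library's `concurrent.futures` rather than the Dask API; the `summarize` function and its input are hypothetical stand-ins for the histogram computation.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize(values):
    """Reduce a list of numbers to a small summary (a stand-in for the histogram)."""
    return {"n": len(values), "total": sum(values)}


with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns immediately with a Future; the work runs in the background.
    future = pool.submit(summarize, list(range(1_000)))

    # Other work could happen here while the task is running.

    # result() blocks until the task has finished and returns its value.
    summary = future.result()

print(summary)
```

With Dask the background work runs on the remote cluster instead of a local thread, but the calling code has the same shape: hold on to the Future, then ask for its result when it is needed.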

Because the Dask cluster is in adaptive mode, this computation has kicked off a chain of events: Dask has noticed that it suddenly has many tasks to compute, so it asks the cluster manager (Kubernetes in this case) for more workers. The Kubernetes cluster then asks its compute backend (Google Compute Engine in this case) for more virtual machines. As these machines come online, our workers will come to life and the cluster will start making progress on our computation.

LENS on AWS

This computation is very similar to the ERA5 computation. The primary difference is that the LENS dataset is an ensemble. We’ll histogram a single member of that ensemble.

The Intake catalog created by NCAR includes many things, so we’ll use intake-esm to search for the URL we want.

col = intake.open_esm_datastore(
    "https://raw.githubusercontent.com/NCAR/cesm-lens-aws/master/intake-catalogs/aws-cesm1-le.json"
)
res = col.search(frequency='hourly6-1990-2005', variable='PRECT')
res.df
   component  frequency          experiment  variable  path
0  atm        hourly6-1990-2005  20C         PRECT     s3://ncar-cesm-lens/atm/hourly6-1990-2005/cesm...
url = res.df.loc[0, "path"]
url
's3://ncar-cesm-lens/atm/hourly6-1990-2005/cesmLE-20C-PRECT.zarr'

We’ll (lazily) load that data from S3 using s3fs, xarray, and zarr.

fs = s3fs.S3FileSystem(anon=True)
lens = xr.open_zarr(fs.get_mapper(url), consolidated=True)
lens
xarray.Dataset
    • ilev: 31
    • lat: 192
    • lev: 30
    • lon: 288
    • member_id: 36
    • nbnd: 2
    • slat: 191
    • slon: 288
    • time: 23360
    • ilev
      (ilev)
      float64
      2.255 5.032 10.16 ... 985.1 1e+03
      formula_terms :
      a: hyai b: hybi p0: P0 ps: PS
      long_name :
      hybrid level at interfaces (1000*(A+B))
      positive :
      down
      standard_name :
      atmosphere_hybrid_sigma_pressure_coordinate
      units :
      level
      array([   2.25524 ,    5.031692,   10.157947,   18.555317,   30.669123,
               45.867477,   63.323483,   80.701418,   94.941042,  111.693211,
              131.401271,  154.586807,  181.863353,  213.952821,  251.704417,
              296.117216,  348.366588,  409.835219,  482.149929,  567.224421,
              652.332969,  730.445892,  796.363071,  845.353667,  873.715866,
              900.324631,  924.964462,  947.432335,  967.538625,  985.11219 ,
             1000.      ])
    • lat
      (lat)
      float64
      -90.0 -89.06 -88.12 ... 89.06 90.0
      long_name :
      latitude
      units :
      degrees_north
      array([-90.      , -89.057592, -88.115183, -87.172775, -86.230366, -85.287958,
             -84.34555 , -83.403141, -82.460733, -81.518325, -80.575916, -79.633508,
             -78.691099, -77.748691, -76.806283, -75.863874, -74.921466, -73.979058,
             -73.036649, -72.094241, -71.151832, -70.209424, -69.267016, -68.324607,
             -67.382199, -66.439791, -65.497382, -64.554974, -63.612565, -62.670157,
             -61.727749, -60.78534 , -59.842932, -58.900524, -57.958115, -57.015707,
             -56.073298, -55.13089 , -54.188482, -53.246073, -52.303665, -51.361257,
             -50.418848, -49.47644 , -48.534031, -47.591623, -46.649215, -45.706806,
             -44.764398, -43.82199 , -42.879581, -41.937173, -40.994764, -40.052356,
             -39.109948, -38.167539, -37.225131, -36.282723, -35.340314, -34.397906,
             -33.455497, -32.513089, -31.570681, -30.628272, -29.685864, -28.743455,
             -27.801047, -26.858639, -25.91623 , -24.973822, -24.031414, -23.089005,
             -22.146597, -21.204188, -20.26178 , -19.319372, -18.376963, -17.434555,
             -16.492147, -15.549738, -14.60733 , -13.664921, -12.722513, -11.780105,
             -10.837696,  -9.895288,  -8.95288 ,  -8.010471,  -7.068063,  -6.125654,
              -5.183246,  -4.240838,  -3.298429,  -2.356021,  -1.413613,  -0.471204,
               0.471204,   1.413613,   2.356021,   3.298429,   4.240838,   5.183246,
               6.125654,   7.068063,   8.010471,   8.95288 ,   9.895288,  10.837696,
              11.780105,  12.722513,  13.664921,  14.60733 ,  15.549738,  16.492147,
              17.434555,  18.376963,  19.319372,  20.26178 ,  21.204188,  22.146597,
              23.089005,  24.031414,  24.973822,  25.91623 ,  26.858639,  27.801047,
              28.743455,  29.685864,  30.628272,  31.570681,  32.513089,  33.455497,
              34.397906,  35.340314,  36.282723,  37.225131,  38.167539,  39.109948,
              40.052356,  40.994764,  41.937173,  42.879581,  43.82199 ,  44.764398,
              45.706806,  46.649215,  47.591623,  48.534031,  49.47644 ,  50.418848,
              51.361257,  52.303665,  53.246073,  54.188482,  55.13089 ,  56.073298,
              57.015707,  57.958115,  58.900524,  59.842932,  60.78534 ,  61.727749,
              62.670157,  63.612565,  64.554974,  65.497382,  66.439791,  67.382199,
              68.324607,  69.267016,  70.209424,  71.151832,  72.094241,  73.036649,
              73.979058,  74.921466,  75.863874,  76.806283,  77.748691,  78.691099,
              79.633508,  80.575916,  81.518325,  82.460733,  83.403141,  84.34555 ,
              85.287958,  86.230366,  87.172775,  88.115183,  89.057592,  90.      ])
    • lev
      (lev)
      float64
      3.643 7.595 14.36 ... 976.3 992.6
      formula_terms :
      a: hyam b: hybm p0: P0 ps: PS
      long_name :
      hybrid level at midpoints (1000*(A+B))
      positive :
      down
      standard_name :
      atmosphere_hybrid_sigma_pressure_coordinate
      units :
      level
      array([  3.643466,   7.59482 ,  14.356632,  24.61222 ,  38.2683  ,  54.59548 ,
              72.012451,  87.82123 , 103.317127, 121.547241, 142.994039, 168.22508 ,
             197.908087, 232.828619, 273.910817, 322.241902, 379.100904, 445.992574,
             524.687175, 609.778695, 691.38943 , 763.404481, 820.858369, 859.534767,
             887.020249, 912.644547, 936.198398, 957.48548 , 976.325407, 992.556095])
    • lon
      (lon)
      float64
      0.0 1.25 2.5 ... 356.2 357.5 358.8
      long_name :
      longitude
      units :
      degrees_east
      array([  0.  ,   1.25,   2.5 , ..., 356.25, 357.5 , 358.75])
    • member_id
      (member_id)
      int64
      1 2 3 4 5 6 ... 31 32 33 34 35 104
      array([  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,  14,
              15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,  26,  27,  28,
              29,  30,  31,  32,  33,  34,  35, 104])
    • slat
      (slat)
      float64
      -89.53 -88.59 ... 88.59 89.53
      long_name :
      staggered latitude
      units :
      degrees_north
      array([-89.528796, -88.586387, -87.643979, -86.701571, -85.759162, -84.816754,
             -83.874346, -82.931937, -81.989529, -81.04712 , -80.104712, -79.162304,
             -78.219895, -77.277487, -76.335079, -75.39267 , -74.450262, -73.507853,
             -72.565445, -71.623037, -70.680628, -69.73822 , -68.795812, -67.853403,
             -66.910995, -65.968586, -65.026178, -64.08377 , -63.141361, -62.198953,
             -61.256545, -60.314136, -59.371728, -58.429319, -57.486911, -56.544503,
             -55.602094, -54.659686, -53.717277, -52.774869, -51.832461, -50.890052,
             -49.947644, -49.005236, -48.062827, -47.120419, -46.17801 , -45.235602,
             -44.293194, -43.350785, -42.408377, -41.465969, -40.52356 , -39.581152,
             -38.638743, -37.696335, -36.753927, -35.811518, -34.86911 , -33.926702,
             -32.984293, -32.041885, -31.099476, -30.157068, -29.21466 , -28.272251,
             -27.329843, -26.387435, -25.445026, -24.502618, -23.560209, -22.617801,
             -21.675393, -20.732984, -19.790576, -18.848168, -17.905759, -16.963351,
             -16.020942, -15.078534, -14.136126, -13.193717, -12.251309, -11.308901,
             -10.366492,  -9.424084,  -8.481675,  -7.539267,  -6.596859,  -5.65445 ,
              -4.712042,  -3.769634,  -2.827225,  -1.884817,  -0.942408,   0.      ,
               0.942408,   1.884817,   2.827225,   3.769634,   4.712042,   5.65445 ,
               6.596859,   7.539267,   8.481675,   9.424084,  10.366492,  11.308901,
              12.251309,  13.193717,  14.136126,  15.078534,  16.020942,  16.963351,
              17.905759,  18.848168,  19.790576,  20.732984,  21.675393,  22.617801,
              23.560209,  24.502618,  25.445026,  26.387435,  27.329843,  28.272251,
              29.21466 ,  30.157068,  31.099476,  32.041885,  32.984293,  33.926702,
              34.86911 ,  35.811518,  36.753927,  37.696335,  38.638743,  39.581152,
              40.52356 ,  41.465969,  42.408377,  43.350785,  44.293194,  45.235602,
              46.17801 ,  47.120419,  48.062827,  49.005236,  49.947644,  50.890052,
              51.832461,  52.774869,  53.717277,  54.659686,  55.602094,  56.544503,
              57.486911,  58.429319,  59.371728,  60.314136,  61.256545,  62.198953,
              63.141361,  64.08377 ,  65.026178,  65.968586,  66.910995,  67.853403,
              68.795812,  69.73822 ,  70.680628,  71.623037,  72.565445,  73.507853,
              74.450262,  75.39267 ,  76.335079,  77.277487,  78.219895,  79.162304,
              80.104712,  81.04712 ,  81.989529,  82.931937,  83.874346,  84.816754,
              85.759162,  86.701571,  87.643979,  88.586387,  89.528796])
    • slon
      (slon)
      float64
      -0.625 0.625 1.875 ... 356.9 358.1
      long_name :
      staggered longitude
      units :
      degrees_east
      array([ -0.625,   0.625,   1.875, ..., 355.625, 356.875, 358.125])
    • time
      (time)
      object
      1990-01-01 06:00:00 ... 2006-01-01 00:00:00
      bounds :
      time_bnds
      long_name :
      time
      array([cftime.DatetimeNoLeap(1990-01-01 06:00:00),
             cftime.DatetimeNoLeap(1990-01-01 12:00:00),
             cftime.DatetimeNoLeap(1990-01-01 18:00:00), ...,
             cftime.DatetimeNoLeap(2005-12-31 12:00:00),
             cftime.DatetimeNoLeap(2005-12-31 18:00:00),
             cftime.DatetimeNoLeap(2006-01-01 00:00:00)], dtype=object)
    • P0
      ()
      float64
      ...
      long_name :
      reference pressure
      units :
      Pa
      array(100000.)
    • PRECT
      (member_id, time, lat, lon)
      float32
      dask.array<chunksize=(2, 504, 192, 288), meta=np.ndarray>
      cell_methods :
      time: mean
      long_name :
      Total (convective and large-scale) precipitation rate (liq + ice)
      units :
      m/s
      Array Chunk
      Bytes 186.01 GB 222.95 MB
      Shape (36, 23360, 192, 288) (2, 504, 192, 288)
      Count 847 Tasks 846 Chunks
      Type float32 numpy.ndarray
    • area
      (lat, lon)
      float32
      dask.array<chunksize=(192, 288), meta=np.ndarray>
      long_name :
      Grid-Cell Area
      standard_name :
      cell_area
      units :
      m2
      Array Chunk
      Bytes 221.18 kB 221.18 kB
      Shape (192, 288) (192, 288)
      Count 2 Tasks 1 Chunks
      Type float32 numpy.ndarray
    • ch4vmr
      (time)
      float64
      dask.array<chunksize=(504,), meta=np.ndarray>
      long_name :
      ch4 volume mixing ratio
      Array Chunk
      Bytes 186.88 kB 4.03 kB
      Shape (23360,) (504,)
      Count 48 Tasks 47 Chunks
      Type float64 numpy.ndarray
    • co2vmr
      (time)
      float64
      dask.array<chunksize=(504,), meta=np.ndarray>
      long_name :
      co2 volume mixing ratio
      Array Chunk
      Bytes 186.88 kB 4.03 kB
      Shape (23360,) (504,)
      Count 48 Tasks 47 Chunks
      Type float64 numpy.ndarray
    • date
      (time)
      int32
      dask.array<chunksize=(504,), meta=np.ndarray>
      long_name :
      current date (YYYYMMDD)
      Array Chunk
      Bytes 93.44 kB 2.02 kB
      Shape (23360,) (504,)
      Count 48 Tasks 47 Chunks
      Type int32 numpy.ndarray
    • date_written
      (time)
      |S8
      dask.array<chunksize=(504,), meta=np.ndarray>
      Array Chunk
      Bytes 186.88 kB 4.03 kB
      Shape (23360,) (504,)
      Count 48 Tasks 47 Chunks
      Type |S8 numpy.ndarray
    • datesec
      (time)
      int32
      dask.array<chunksize=(504,), meta=np.ndarray>
      long_name :
      current seconds of current date
      Array Chunk
      Bytes 93.44 kB 2.02 kB
      Shape (23360,) (504,)
      Count 48 Tasks 47 Chunks
      Type int32 numpy.ndarray
    • f11vmr
      (time)
      float64
      dask.array<chunksize=(504,), meta=np.ndarray>
      long_name :
      f11 volume mixing ratio
      Array Chunk
      Bytes 186.88 kB 4.03 kB
      Shape (23360,) (504,)
      Count 48 Tasks 47 Chunks
      Type float64 numpy.ndarray
    • f12vmr
      (time)
      float64
      dask.array<chunksize=(504,), meta=np.ndarray>
      long_name :
      f12 volume mixing ratio
      Array Chunk
      Bytes 186.88 kB 4.03 kB
      Shape (23360,) (504,)
      Count 48 Tasks 47 Chunks
      Type float64 numpy.ndarray
    • gw
      (lat)
      float64
      dask.array<chunksize=(192,), meta=np.ndarray>
      long_name :
      gauss weights
      Array Chunk
      Bytes 1.54 kB 1.54 kB
      Shape (192,) (192,)
      Count 2 Tasks 1 Chunks
      Type float64 numpy.ndarray
    • hyai
      (ilev)
      float64
      dask.array<chunksize=(31,), meta=np.ndarray>
      long_name :
      hybrid A coefficient at layer interfaces
      Array Chunk
      Bytes 248 B 248 B
      Shape (31,) (31,)
      Count 2 Tasks 1 Chunks
      Type float64 numpy.ndarray
    • hyam
      (lev)
      float64
      dask.array<chunksize=(30,), meta=np.ndarray>
      long_name :
      hybrid A coefficient at layer midpoints
      Array Chunk
      Bytes 240 B 240 B
      Shape (30,) (30,)
      Count 2 Tasks 1 Chunks
      Type float64 numpy.ndarray
    • hybi
      (ilev)
      float64
      dask.array<chunksize=(31,), meta=np.ndarray>
      long_name :
      hybrid B coefficient at layer interfaces
      Array Chunk
      Bytes 248 B 248 B
      Shape (31,) (31,)
      Count 2 Tasks 1 Chunks
      Type float64 numpy.ndarray
    • hybm
      (lev)
      float64
      dask.array<chunksize=(30,), meta=np.ndarray>
      long_name :
      hybrid B coefficient at layer midpoints
      Array Chunk
      Bytes 240 B 240 B
      Shape (30,) (30,)
      Count 2 Tasks 1 Chunks
      Type float64 numpy.ndarray
    • mdt
      ()
      int32
      ...
      long_name :
      timestep
      units :
      s
      array(1800, dtype=int32)
    • n2ovmr
      (time)
      float64
      dask.array<chunksize=(504,), meta=np.ndarray>
      long_name :
      n2o volume mixing ratio
      Array Chunk
      Bytes 186.88 kB 4.03 kB
      Shape (23360,) (504,)
      Count 48 Tasks 47 Chunks
      Type float64 numpy.ndarray
    • nbdate
      ()
      int32
      ...
      long_name :
      base date (YYYYMMDD)
      array(18500101, dtype=int32)
    • nbsec
      ()
      int32
      ...
      long_name :
      seconds of base date
      array(0, dtype=int32)
    • ndbase
      ()
      int32
      ...
      long_name :
      base day
      array(0, dtype=int32)
    • ndcur
      (time)
      int32
      dask.array<chunksize=(504,), meta=np.ndarray>
      long_name :
      current day (from base day)
      Array Chunk
      Bytes 93.44 kB 2.02 kB
      Shape (23360,) (504,)
      Count 48 Tasks 47 Chunks
      Type int32 numpy.ndarray
      23360 1
    • nlon
      (lat)
      int32
      dask.array<chunksize=(192,), meta=np.ndarray>
      long_name :
      number of longitudes
      Array Chunk
      Bytes 768 B 768 B
      Shape (192,) (192,)
      Count 2 Tasks 1 Chunks
      Type int32 numpy.ndarray
      192 1
    • nsbase
      ()
      int32
      ...
      long_name :
      seconds of base day
      array(0, dtype=int32)
    • nscur
      (time)
      int32
      dask.array<chunksize=(504,), meta=np.ndarray>
      long_name :
      current seconds of current day
      Array Chunk
      Bytes 93.44 kB 2.02 kB
      Shape (23360,) (504,)
      Count 48 Tasks 47 Chunks
      Type int32 numpy.ndarray
      23360 1
    • nsteph
      (time)
      int32
      dask.array<chunksize=(504,), meta=np.ndarray>
      long_name :
      current timestep
      Array Chunk
      Bytes 93.44 kB 2.02 kB
      Shape (23360,) (504,)
      Count 48 Tasks 47 Chunks
      Type int32 numpy.ndarray
      23360 1
    • ntrk
      ()
      int32
      ...
      long_name :
      spectral truncation parameter K
      array(1, dtype=int32)
    • ntrm
      ()
      int32
      ...
      long_name :
      spectral truncation parameter M
      array(1, dtype=int32)
    • ntrn
      ()
      int32
      ...
      long_name :
      spectral truncation parameter N
      array(1, dtype=int32)
    • sol_tsi
      (time)
      float64
      dask.array<chunksize=(504,), meta=np.ndarray>
      long_name :
      total solar irradiance
      units :
      W/m2
      Array Chunk
      Bytes 186.88 kB 4.03 kB
      Shape (23360,) (504,)
      Count 48 Tasks 47 Chunks
      Type float64 numpy.ndarray
      23360 1
    • time_bnds
      (time, nbnd)
      object
      dask.array<chunksize=(11680, 2), meta=np.ndarray>
      long_name :
      time interval endpoints
      Array Chunk
      Bytes 373.76 kB 186.88 kB
      Shape (23360, 2) (11680, 2)
      Count 3 Tasks 2 Chunks
      Type object numpy.ndarray
      2 23360
    • time_written
      (time)
      |S8
      dask.array<chunksize=(504,), meta=np.ndarray>
      Array Chunk
      Bytes 186.88 kB 4.03 kB
      Shape (23360,) (504,)
      Count 48 Tasks 47 Chunks
      Type |S8 numpy.ndarray
      23360 1
    • w_stag
      (slat)
      float64
      dask.array<chunksize=(191,), meta=np.ndarray>
      long_name :
      staggered latitude weights
      Array Chunk
      Bytes 1.53 kB 1.53 kB
      Shape (191,) (191,)
      Count 2 Tasks 1 Chunks
      Type float64 numpy.ndarray
      191 1
    • wnummax
      (lat)
      int32
      dask.array<chunksize=(192,), meta=np.ndarray>
      long_name :
      cutoff Fourier wavenumber
      Array Chunk
      Bytes 768 B 768 B
      Shape (192,) (192,)
      Count 2 Tasks 1 Chunks
      Type int32 numpy.ndarray
      192 1
  • Conventions :
    CF-1.0
    NCO :
    4.3.4
    Version :
    $Name$
    history :
    2019-08-01 00:15:18.487461 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.001.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:19.080785 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.002.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:20.252396 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.003.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:20.787281 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.004.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:21.279874 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.005.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:21.850205 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.006.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:22.423595 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.007.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:23.127816 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.008.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:23.695110 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.009.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:24.291352 
xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.010.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:24.873420 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.011.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:25.512516 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.012.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:26.061289 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.013.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:26.662665 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.014.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:27.243923 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.015.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:27.799712 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.016.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:28.350833 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.017.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:28.882690 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.018.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:29.612376 
xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.019.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:30.142923 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.020.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:32.677487 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.021.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:33.314355 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.022.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:35.416995 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.023.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:37.400624 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.024.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:40.313590 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.025.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:42.594527 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.026.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:44.729537 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.027.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:46.637571 
xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.028.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:48.589381 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.029.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:50.705311 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.030.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:51.206031 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.031.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:51.683246 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.032.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:52.156426 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.033.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:54.220732 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.034.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:56.793005 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.035.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:58.913802 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.104.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:16:06.104904 xarray.concat(<ALL_MEMBERS>, dim='member_id', coords='minimal')
    important_note :
    This data is part of the project 'Blind Evaluation of Lossy Data-Compression in LENS'. Please exercise caution before using this data for other purposes.
    nco_openmp_thread_number :
    1
    revision_Id :
    $Id$
    source :
    CAM
    title :
    UNSET
hour = 60*60

precip_in_m = lens.PRECT * (6 * hour)
precip_in_m
xarray.DataArray
'PRECT'
  • member_id: 36
  • time: 23360
  • lat: 192
  • lon: 288
  • dask.array<chunksize=(2, 504, 192, 288), meta=np.ndarray>
    Array Chunk
    Bytes 186.01 GB 222.95 MB
    Shape (36, 23360, 192, 288) (2, 504, 192, 288)
    Count 1693 Tasks 846 Chunks
    Type float32 numpy.ndarray
    36 1 288 192 23360
    • lat
      (lat)
      float64
      -90.0 -89.06 -88.12 ... 89.06 90.0
      long_name :
      latitude
      units :
      degrees_north
      array([-90.      , -89.057592, -88.115183, -87.172775, -86.230366, -85.287958,
             -84.34555 , -83.403141, -82.460733, -81.518325, -80.575916, -79.633508,
             -78.691099, -77.748691, -76.806283, -75.863874, -74.921466, -73.979058,
             -73.036649, -72.094241, -71.151832, -70.209424, -69.267016, -68.324607,
             -67.382199, -66.439791, -65.497382, -64.554974, -63.612565, -62.670157,
             -61.727749, -60.78534 , -59.842932, -58.900524, -57.958115, -57.015707,
             -56.073298, -55.13089 , -54.188482, -53.246073, -52.303665, -51.361257,
             -50.418848, -49.47644 , -48.534031, -47.591623, -46.649215, -45.706806,
             -44.764398, -43.82199 , -42.879581, -41.937173, -40.994764, -40.052356,
             -39.109948, -38.167539, -37.225131, -36.282723, -35.340314, -34.397906,
             -33.455497, -32.513089, -31.570681, -30.628272, -29.685864, -28.743455,
             -27.801047, -26.858639, -25.91623 , -24.973822, -24.031414, -23.089005,
             -22.146597, -21.204188, -20.26178 , -19.319372, -18.376963, -17.434555,
             -16.492147, -15.549738, -14.60733 , -13.664921, -12.722513, -11.780105,
             -10.837696,  -9.895288,  -8.95288 ,  -8.010471,  -7.068063,  -6.125654,
              -5.183246,  -4.240838,  -3.298429,  -2.356021,  -1.413613,  -0.471204,
               0.471204,   1.413613,   2.356021,   3.298429,   4.240838,   5.183246,
               6.125654,   7.068063,   8.010471,   8.95288 ,   9.895288,  10.837696,
              11.780105,  12.722513,  13.664921,  14.60733 ,  15.549738,  16.492147,
              17.434555,  18.376963,  19.319372,  20.26178 ,  21.204188,  22.146597,
              23.089005,  24.031414,  24.973822,  25.91623 ,  26.858639,  27.801047,
              28.743455,  29.685864,  30.628272,  31.570681,  32.513089,  33.455497,
              34.397906,  35.340314,  36.282723,  37.225131,  38.167539,  39.109948,
              40.052356,  40.994764,  41.937173,  42.879581,  43.82199 ,  44.764398,
              45.706806,  46.649215,  47.591623,  48.534031,  49.47644 ,  50.418848,
              51.361257,  52.303665,  53.246073,  54.188482,  55.13089 ,  56.073298,
              57.015707,  57.958115,  58.900524,  59.842932,  60.78534 ,  61.727749,
              62.670157,  63.612565,  64.554974,  65.497382,  66.439791,  67.382199,
              68.324607,  69.267016,  70.209424,  71.151832,  72.094241,  73.036649,
              73.979058,  74.921466,  75.863874,  76.806283,  77.748691,  78.691099,
              79.633508,  80.575916,  81.518325,  82.460733,  83.403141,  84.34555 ,
              85.287958,  86.230366,  87.172775,  88.115183,  89.057592,  90.      ])
    • lon
      (lon)
      float64
      0.0 1.25 2.5 ... 356.2 357.5 358.8
      long_name :
      longitude
      units :
      degrees_east
      array([  0.  ,   1.25,   2.5 , ..., 356.25, 357.5 , 358.75])
    • member_id
      (member_id)
      int64
      1 2 3 4 5 6 ... 31 32 33 34 35 104
      array([  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,  14,
              15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,  26,  27,  28,
              29,  30,  31,  32,  33,  34,  35, 104])
    • time
      (time)
      object
      1990-01-01 06:00:00 ... 2006-01-01 00:00:00
      bounds :
      time_bnds
      long_name :
      time
      array([cftime.DatetimeNoLeap(1990-01-01 06:00:00),
             cftime.DatetimeNoLeap(1990-01-01 12:00:00),
             cftime.DatetimeNoLeap(1990-01-01 18:00:00), ...,
             cftime.DatetimeNoLeap(2005-12-31 12:00:00),
             cftime.DatetimeNoLeap(2005-12-31 18:00:00),
             cftime.DatetimeNoLeap(2006-01-01 00:00:00)], dtype=object)
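
The sizes reported in the repr above follow directly from the array's shape, chunk shape, and dtype. As a quick sanity check, the sketch below reproduces the total size, the per-chunk size, and the chunk count using plain Python arithmetic. The helper name `nbytes` is ours for illustration; it is not part of xarray or Dask.

```python
import math

shape = (36, 23360, 192, 288)   # (member_id, time, lat, lon)
chunks = (2, 504, 192, 288)     # chunk shape reported by Dask
itemsize = 4                    # float32 uses 4 bytes per value


def nbytes(dims, itemsize):
    """Bytes needed to store an array with the given dimensions."""
    return math.prod(dims) * itemsize


total_bytes = nbytes(shape, itemsize)
chunk_bytes = nbytes(chunks, itemsize)
# Number of chunks along each axis, rounding up for a partial final chunk.
n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))

print(f"{total_bytes / 1e9:.2f} GB")   # 186.01 GB
print(f"{chunk_bytes / 1e6:.2f} MB")   # 222.95 MB
print(n_chunks)                        # 846
```

Each chunk is roughly 223 MB, so it fits comfortably in a single worker's memory, while the full array is far too large to pull down to a laptop.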

We’ll select the first member for comparison with the ERA5 histogram.

lens_hist = histogram(
    precip_in_m.isel(member_id=0).rename("tp_6hr"),
    bins=[tp_6hr_bins], dim=["lon"]
).mean(dim="time")
lens_hist.data
Array Chunk
Bytes 76.80 kB 76.80 kB
Shape (192, 50) (192, 50)
Count 2369 Tasks 1 Chunks
Type float64 numpy.ndarray
50 192
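
To make the reduction concrete: passing `dim=["lon"]` to `histogram` counts, for each (time, lat) pair, how many longitudes fall into each bin, and the mean over `time` then averages those counts, leaving a (lat, bin) array. The NumPy sketch below reproduces that logic on a tiny synthetic array. The array sizes, the random data, and the `bins` edges are made up for illustration; they are not the `tp_6hr_bins` used above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_time, n_lat, n_lon = 4, 3, 10
# Synthetic stand-in for 6-hourly precipitation, dims (time, lat, lon).
data = rng.random((n_time, n_lat, n_lon))
bins = np.linspace(0.0, 1.0, 6)  # 5 bins; illustrative edges only

# Histogram along lon: one set of bin counts per (time, lat) pair.
counts = np.empty((n_time, n_lat, len(bins) - 1))
for t in range(n_time):
    for j in range(n_lat):
        counts[t, j], _ = np.histogram(data[t, j], bins=bins)

# Average the counts over time, leaving a (lat, bin) summary.
mean_counts = counts.mean(axis=0)

print(mean_counts.shape)        # (3, 5)
print(mean_counts.sum(axis=1))  # each row sums to n_lon
```

The output has one row per latitude and one column per bin, which is why the real result above has shape (192, 50) regardless of how many time steps went into it.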

Note that we’re using the aws_client, because LENS is stored in an AWS region.

lens_hist_ = aws_client.compute(lens_hist)

Compare results

Let’s plot the histograms for both the ERA5 and LENS data. These results are small, so it’s safe to transfer them from the clusters to the client machine for plotting.

lens_tp_hist_ = lens_hist_.result()
era5_tp_hist_ = era5_tp_hist_.result()
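
`Client.compute` returns immediately with a future, and `.result()` blocks until the value is ready and brings it back to the client. Dask's futures extend the interface of the standard library's `concurrent.futures`, so the pattern can be sketched locally without a cluster. In this sketch a thread pool stands in for the remote cluster, and `summarize` is a hypothetical stand-in for the histogram reduction:

```python
from concurrent.futures import ThreadPoolExecutor


def summarize(values):
    """Hypothetical reduction: collapse a large input to a small summary."""
    return {"count": len(values), "total": sum(values)}


big_input = list(range(1_000_000))

with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns a future right away; the work runs in the background.
    future = pool.submit(summarize, big_input)
    # result() blocks until the work finishes, then returns the small summary.
    summary = future.result()

print(summary["count"])  # 1000000
```

Binding the result to a new name, rather than overwriting the variable that holds the future, keeps a notebook cell safe to re-run.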

For ERA5:

era5_tp_hist_[:, 1:].plot(xscale='log');
../../_images/multicloud_40_0.png

And for LENS:

lens_tp_hist_[:, 1:].plot(xscale='log');
../../_images/multicloud_42_0.png

Cleanup

Closing the clients and the clusters frees the resources we were using.

aws_client.close()
aws.close()

gcp_client.close()
gcp.close()
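
If an earlier cell raises, these `close()` calls might never run and the clusters could keep consuming resources. One way to guard against that is to register each resource with `contextlib.ExitStack`, which closes everything in reverse order even when an exception occurs. The sketch below uses a hypothetical `Resource` class as a stand-in for the clients and clusters, so it runs without any cloud access:

```python
from contextlib import ExitStack, closing

closed = []  # records the order in which resources are closed


class Resource:
    """Hypothetical stand-in for a Dask client or cluster."""

    def __init__(self, name):
        self.name = name

    def close(self):
        closed.append(self.name)


try:
    with ExitStack() as stack:
        # Register each cluster before its client so the client closes first.
        for name in ["aws", "aws_client", "gcp", "gcp_client"]:
            stack.enter_context(closing(Resource(name)))
        raise RuntimeError("simulated failure during analysis")
except RuntimeError:
    pass

print(closed)  # ['gcp_client', 'gcp', 'aws_client', 'aws']
```

In a long-lived interactive notebook this structure is less convenient than explicit `close()` calls, but it may be worth considering when the same workflow is run as a script.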

Behind the Scenes

We deployed some infrastructure to make this notebook runnable. In line with one of Pangeo’s guiding principles, each of these technologies has an open architecture.

From low-level to high-level:

  • Terraform provides the tools for provisioning the cloud resources needed for the clusters.

  • Kubernetes provides the container orchestration for deploying the Dask clusters. We created Kubernetes clusters in AWS’s us-west-2 and GCP’s us-central1 regions.

  • Dask Gateway provides centralized, secure access to Dask clusters. We deployed it using Helm on the two Kubernetes clusters.

  • Dask provides scalable, distributed computation for analyzing these large datasets.

  • xarray provides high-level APIs and high-performance data structures for working with this data.

  • Intake, gcsfs, and s3fs provide catalogs for data discovery and libraries for loading that data.

  • JupyterLab provides a user interface for interactive computing. The client laptop interacts with the clusters through JupyterLab.

All of the resources for this demo are available at https://github.com/pangeo-data/multicloud-demo.